Analysis of Belief Propagation for Non-Linear Problems:
The Example of CDMA (or: How to Prove Tanaka's Formula)
Abstract: We consider the CDMA (code-division multiple-access)
multi-user detection problem for binary signals and
additive white gaussian noise. We propose a spreading sequence
scheme based on random sparse signatures, and a detection
algorithm based on belief propagation (BP) with linear time
complexity. In the new scheme, each user conveys its power onto
a finite number of chips $\bar{\ell}$, in the large system limit.

We analyze the performances of BP detection and prove that
they coincide with the ones of optimal (symbol MAP) detection in
the $\bar{\ell} \to \infty$ limit. In the same limit, we prove that the information
capacity of the system converges to Tanaka's formula for random 
'dense' signatures, thus providing the first rigorous justification 
of this formula. Apart from being computationally convenient, 
the new scheme allows for optimization in close analogy with 
irregular low density parity check code ensembles. 



I. Introduction 



A. Motivation 



The crucial new characteristics of modern (iterative) coding
systems [1] are: (i) probabilistic construction based on
sparse random graphs; (ii) iterative (belief propagation, BP)
decoding; (iii) focus on the large system limit. Despite
their generality, the impact of these principles outside the
area of linear error correcting codes has been limited. It is
therefore extremely interesting to extend their scope to other
communications and information theory problems.¹

The tools developed for the analysis of iterative coding
systems must be considerably strengthened in order to cope
with such generalizations. Consider for instance the question
of whether BP decoding is asymptotically optimal (in the large
system limit), i.e. whether it implements symbol MAP decoding. For
LDPC codes, density evolution (DE) allows one to show that this
is the case if the noise level is smaller than a threshold, below
which the asymptotic BP bit error rate $P_b^{BP}$ vanishes. When
$P_b^{BP} > 0$ (as we expect in a general setting), one cannot say
much about MAP performances, and their relation to BP (apart
from the obvious sub-optimality of BP).

¹ An earlier example that supports this view is the use of low density codes
with non-linear checks for lossy data compression in [2].
Recently, some definite progress was made on these problems
in the context of LDPC codes [3], [4], [5]. The basic
new ingredient is a 'general area theorem' that yields the rate
of change of the mutual information across the system, under
a change in the channel parameter. Earlier examples of such a
relation were found by Ashikhmin, Kramer, and ten Brink [6]
(for the erasure channel), and Guo, Shamai and Verdú [7], [8]
(for the gaussian and Poisson channels). The approach based
on the area theorem seems rather general. In order to illustrate
it, and further explore its capabilities, we consider here a new
application: multi-user detection [9].

B. Multi-user detection with binary inputs 

In a simple multi-user detection scenario, each of $K$ users
transmits a symbol $x_i \in \mathbb{R}$ to a common receiver, after
encoding it using a signature $s_i \in \mathbb{R}^N$. The received signal is

$$y = \sum_{i=1}^{K} x_i s_i + w, \tag{1}$$

where the noise $w$ is a vector of $N$ i.i.d. gaussian variables
of mean $0$ and variance $\sigma^2$. The input symbols $x_i$ are also
modeled as i.i.d. random variables. Writing $\mathbb{S}$ for the
$N \times K$ matrix with columns $s_1, \dots, s_K$, and
$x = (x_1, \dots, x_K)^T$ for the input,
the above equation can also be written $y = \mathbb{S} x + w$. Of great
interest is the large system limit $N, K \to \infty$ with $K/N = \alpha$
fixed.

How reliably can the input $x$ be reconstructed given $y$ and
the signature matrix $\mathbb{S}$? In order to answer this question, the
signatures $s_i$ are usually taken to be i.i.d. random vectors.
The standard choice is to set
$s_i = \frac{1}{\sqrt{N}}(s_{i1}, \dots, s_{iN})^T$ where
the $s_{ia}$ are i.i.d. with zero mean and unit variance (we will
call these 'dense signatures'). Tse and Hanly [10], and Verdú
and Shamai [11] considered the case in which the input
symbols $x_i$ are gaussian random variables. Using random
matrix theory, they were able to compute the minimum mean
square error, and the information capacity of the system. In
[12], we considered a multi-user detection algorithm based on
BP, and proved it to be optimal (i.e. to implement minimum
mean square error detection) with high probability in the large
system limit.
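
As a concrete illustration of the model $y = \mathbb{S}x + w$ with dense signatures, the following minimal Python sketch draws a signature matrix with i.i.d. $\pm 1/\sqrt{N}$ entries, generates the received signal, and applies a single-user matched filter. The function names, the choice of binary signature entries and binary inputs, and the use of NumPy are assumptions made for illustration only; they are not taken from the text.

```python
import numpy as np

def simulate_dense_cdma(n_chips, n_users, sigma2, rng):
    """Draw (S, x, y) from y = S x + w with dense +-1/sqrt(N) signatures."""
    # Entries are i.i.d. with zero mean and unit variance before scaling.
    S = rng.choice([-1.0, 1.0], size=(n_chips, n_users)) / np.sqrt(n_chips)
    x = rng.choice([-1.0, 1.0], size=n_users)           # illustrative binary inputs
    w = np.sqrt(sigma2) * rng.standard_normal(n_chips)  # white gaussian noise
    return S, x, S @ x + w

def matched_filter(S, y):
    """Matched filter decision: sign of the correlation s_i^T y for each user."""
    return np.where(S.T @ y >= 0.0, 1.0, -1.0)

rng = np.random.default_rng(0)
S, x, y = simulate_dense_cdma(n_chips=64, n_users=32, sigma2=0.1, rng=rng)
x_hat = matched_filter(S, y)
print("matched-filter bit error rate:", np.mean(x_hat != x))
```

The matched filter ignores multi-user interference, so it only serves as a simple baseline against which more refined detectors can be compared.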

The case of binary input symbols $x_i \in \{+1,-1\}$, chosen uniformly
at random, is of obvious interest for practical applications,
and out of reach of classical methods (such as random matrix
theory). Tanaka [13] used the replica method from statistical
physics in order to determine the asymptotic information
capacity. More precisely, let us define the per-user conditional
entropy $h = \lim_{K\to\infty} K^{-1} E_{\mathbb{S}} H(X|Y)$, where the
expectation is taken with respect to the random signatures and
throughout the paper we measure entropies in nats (obviously
$I(X;Y) = K \log 2 - H(X|Y)$). He obtained
$h = h_{RS}(\sigma^2,\alpha) \equiv \sup_q h_{RS}(q;\sigma^2,\alpha)$, where



$$h_{RS}(q;\sigma^2,\alpha) = E_z \log 2\cosh\big(\lambda(q) + \sqrt{\lambda(q)}\, z\big) - \cdots \tag{2}$$

$\lambda(q) = [\sigma^2 + \alpha(1-q)]^{-1}$, and $E_z$ denotes throughout the paper
expectation with respect to the standard normal variable $z$. It
is easy to show that the value of $q$ maximizing $h_{RS}(q;\sigma^2,\alpha)$
must satisfy the stationarity condition

$$q = E_z \tanh^2\big(\lambda(q) + \sqrt{\lambda(q)}\, z\big). \tag{3}$$

Unhappily, the replica method is non-rigorous. In this paper
we will prove Tanaka's formula for $\alpha < \alpha_s \approx 1.49$ (a precise
definition of $\alpha_s$ is provided in the next Section). For earlier
applications of BP to multi-user detection with binary signals,
we refer, for instance, to [14], [15], [16]. We will prove that,
in the same regime $\alpha < \alpha_s$, optimal (symbol MAP) detection
can be implemented using BP.
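
The stationarity condition $q = E_z \tanh^2(\lambda(q) + \sqrt{\lambda(q)}\,z)$, with $\lambda(q) = [\sigma^2 + \alpha(1-q)]^{-1}$, can be explored numerically. The sketch below is one possible approach, not the procedure used by the authors: it evaluates the gaussian expectation by Gauss-Hermite quadrature and iterates the map starting from $q = 0$. All function names and the number of quadrature nodes are illustrative assumptions.

```python
import numpy as np

# Gauss-Hermite nodes and weights for E_z[g(z)] with z standard normal.
_NODES, _WEIGHTS = np.polynomial.hermite_e.hermegauss(80)
_WEIGHTS = _WEIGHTS / _WEIGHTS.sum()

def expect_tanh2(lam):
    """E_z tanh^2(lam + sqrt(lam) z) for a standard normal z."""
    return float(np.sum(_WEIGHTS * np.tanh(lam + np.sqrt(lam) * _NODES) ** 2))

def lam_of_q(q, sigma2, alpha):
    """lambda(q) = 1 / (sigma^2 + alpha (1 - q))."""
    return 1.0 / (sigma2 + alpha * (1.0 - q))

def solve_stationarity(sigma2, alpha, q0=0.0, tol=1e-12, max_iter=10000):
    """Fixed-point iteration q <- E_z tanh^2(lambda(q) + sqrt(lambda(q)) z)."""
    q = q0
    for _ in range(max_iter):
        q_new = expect_tanh2(lam_of_q(q, sigma2, alpha))
        if abs(q_new - q) < tol:
            return q_new
        q = q_new
    return q

print(solve_stationarity(sigma2=0.5, alpha=1.0))
```

Since the map is increasing in $q$, the iteration started at $q_0 = 0$ approaches the smallest fixed point from below; other starting points (for instance $q_0 = 1$) can be used to probe whether several solutions coexist at a given $(\sigma^2, \alpha)$.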

In order to prove these results, we will introduce a new
'sparse signature' scheme (see Section II) and view standard
dense signatures as a limiting case. The identity between
the two limiting procedures will be the object of a separate
publication. The new scheme (which is reminiscent of LT
codes [17]) is, on the other hand, interesting per se. It allows
one to implement BP in a very natural way with complexity linear
in $N$. Furthermore, it opens the way to optimization of the
degree sequence, thus improving the performances over dense
signatures. We refer to Section IV for numerical indications
in this direction.

II. The sparse signature scheme, and main results 

A. Sparse signatures and belief propagation 

As already mentioned, in order to prove Tanaka's formula
we shall introduce a new signature scheme. This is characterized
by a distribution $\{\Omega_\ell : \ell \ge 0\}$ over the non-negative integers
(to avoid pathological behaviors, we assume it to have bounded
support). We also let $\bar{\ell} > 0$ be its mean and define
$\omega_\ell = \ell\,\Omega_\ell/\bar{\ell}$
for $\ell > 0$. The user $i$ constructs her signature $s_i$ independently
from the other users as follows. She chooses an integer $\ell$ from
the distribution $\Omega_\ell$, and a subset $\partial i$ of $\{1,\dots,N\}$ of size
$|\partial i| = \ell$ uniformly at random among the $\binom{N}{\ell}$ such subsets. Her
signature is $s_i = \frac{1}{\sqrt{\bar{\ell}}}(s_{i1}, \dots, s_{iN})^T$ where
$s_{ia} \in \{+1,-1\}$
uniformly at random if $a \in \partial i$, and $s_{ia} = 0$ otherwise.
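
The construction above can be sketched in a few lines of Python. The example assumes Poisson degrees (truncated at the number of chips $N$ so that a subset of that size exists) and a normalization by the square root of the mean degree, so that the power averaged over users is 1; the function name and the use of NumPy are illustrative assumptions.

```python
import numpy as np

def sample_sparse_signatures(n_chips, n_users, mean_degree, rng):
    """Return an N x K matrix whose columns are sparse random signatures."""
    S = np.zeros((n_chips, n_users))
    for i in range(n_users):
        # Degree of user i; truncation at N is an assumption of this sketch.
        degree = min(int(rng.poisson(mean_degree)), n_chips)
        # Uniformly random subset of chips of the chosen size.
        chips = rng.choice(n_chips, size=degree, replace=False)
        # Uniformly random signs on the selected chips, zero elsewhere.
        S[chips, i] = rng.choice([-1.0, 1.0], size=degree)
    return S / np.sqrt(mean_degree)

rng = np.random.default_rng(0)
S = sample_sparse_signatures(n_chips=200, n_users=300, mean_degree=4.0, rng=rng)
print("average power per user:", (S ** 2).sum(axis=0).mean())
print("average number of users per chip:", (S != 0).sum(axis=1).mean())
```

For large systems one would store the signatures as adjacency lists or as a sparse matrix rather than a dense array; the dense array is used here only to keep the sketch short.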




Fig. 1. Factor graph representation of the sparse signature scheme: circles
represent users (variable nodes) and squares chips (function nodes).



Notice that the normalization ensures that the average power
employed by each user is equal to 1, as for the dense signature
scheme. However, this power is conveyed onto a finite number
of chips. Vice versa, each chip $a \in \{1, \dots, N\}$ receives power
from a finite number of users, whose set is denoted by $\partial a$ (this is
the set of $i \in \{1, \dots, K\}$ such that $a \in \partial i$). The conditional
distribution of the input symbols, given the received signal $y$,
takes the form



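$$\mu(x|y) = \frac{1}{Z(y)} \prod_{a=1}^{N} \psi_a(x_{\partial a}), \tag{4}$$

$$\psi_a(x_{\partial a}) = \exp\Big\{-\frac{1}{2\sigma^2}\Big(y_a - \frac{1}{\sqrt{\bar{\ell}}}\sum_{i\in\partial a} s_{ia} x_i\Big)^2\Big\}. \tag{5}$$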



Such a distribution is conveniently represented through the associated
factor graph, cf. Fig. 1. This includes $K$ variable nodes
(one for each user $i$), $N$ function nodes (one for each chip $a$)
and an edge joining user $i$ and chip $a$ whenever $i \in \partial a$.

If signatures are chosen according to the proposed scheme,
the resulting factor graph is a sparse random graph. The
degree distribution is $\Omega_\ell$ on the variable node (user) side,
and converges to a Poisson distribution with mean $\bar{\ell}\alpha$ on the
function node (chip) side.

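BP is introduced in the standard way: we limit ourselves to
writing down the update equations in terms of log-likelihoods².
Two types of messages are updated: variable to function
node, $v_{i\to a}$, and function to variable node, $u_{a\to i}$. The update
equations read

$$v^{t+1}_{i\to a} = \sum_{b\in\partial i\setminus a} u^{t}_{b\to i}, \tag{6}$$

$$u^{t}_{a\to i} = f(v^{t}_{j\to a}, s_{ja}, j\in\partial a\setminus i;\ s_{ia};\ y_a), \tag{7}$$

where the index $t$ denotes the iteration number and

$$f(v_1, s_1, \dots, v_k, s_k;\ s_0;\ y) = \frac{1}{2}\log\frac{\sum_{x_1,\dots,x_k} W(+1, x_1, \dots, x_k)}{\sum_{x_1,\dots,x_k} W(-1, x_1, \dots, x_k)}, \tag{8}$$

with $W(x_0, \dots, x_k) = \exp\{-\frac{1}{2\sigma^2}(y - \frac{1}{\sqrt{\bar{\ell}}}\sum_{i=0}^{k} s_i x_i)^2 + \sum_{i=1}^{k} v_i x_i\}$
and the sums running over $x_1, \dots, x_k \in \{+1,-1\}$.
We furthermore adopt the initial condition

$$v^{0}_{i\to a} = 0. \tag{9}$$

After a fixed number of iterations, all the messages incoming at
variable node $i$ are combined to compute the decision
$\hat{x}_i^{t+1} = \mathrm{sign}\{\sum_{a\in\partial i} u^{t}_{a\to i}\}$.

² More precisely, we use here one half of log-likelihoods.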
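
To make the function-node update concrete, the following Python sketch computes a chip-to-user message by brute-force enumeration. It assumes that each chip contributes a gaussian factor $\exp\{-(y_a - \bar{\ell}^{-1/2}\sum_{i\in\partial a} s_{ia}x_i)^2/2\sigma^2\}$ and that messages are half log-likelihood ratios, so that an incoming message $v$ weights the symbol $x$ by $e^{vx}$. The function names are hypothetical, and the cost is exponential in the number of neighbours, which is affordable only because chip degrees stay bounded in the sparse scheme.

```python
import itertools
import math

def _log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def chip_message(v, s, s0, y, sigma2, mean_degree):
    """Half log-likelihood ratio sent by a chip to the user with sign s0.

    v[j], s[j]: incoming half log-likelihoods and signature signs of the
    other k users on the chip; y: received value on the chip.
    """
    scale = 1.0 / math.sqrt(mean_degree)
    log_z = {}
    for x0 in (+1, -1):
        terms = []
        # Enumerate the 2^k configurations of the other users' symbols.
        for xs in itertools.product((+1, -1), repeat=len(v)):
            signal = scale * (s0 * x0 + sum(sj * xj for sj, xj in zip(s, xs)))
            prior = sum(vj * xj for vj, xj in zip(v, xs))
            terms.append(-(y - signal) ** 2 / (2.0 * sigma2) + prior)
        log_z[x0] = _log_sum_exp(terms)
    return 0.5 * (log_z[+1] - log_z[-1])

# A chip with two other users and uninformative incoming messages.
print(chip_message(v=[0.0, 0.0], s=[+1, -1], s0=+1, y=0.3,
                   sigma2=0.5, mean_degree=4.0))
```

With no other user on the chip the message reduces to $s_0 y / (\sqrt{\bar{\ell}}\,\sigma^2)$, the half log-likelihood ratio of a single-user gaussian channel, which is a convenient sanity check.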



B. Main results 

In order to state and prove our main results more easily, it
is convenient to focus on 'Poisson' signature schemes. By
this we mean that $\{\Omega_\ell : \ell \ge 0\}$ is a Poisson distribution of
mean $\bar{\ell}$. We shall come back to the general case in Sections
III-A and IV. Within this setting, we consider the expected
conditional entropy per user $E\,H(X|Y)/K$ (the expectation
being taken with respect to the random signatures). Since
we do not know a priori whether the large system limit
exists, we define
$\overline{h}(\sigma^2,\alpha,\bar{\ell}) = \limsup_{N\to\infty} E\,H(X|Y)/K$,
and $\underline{h}(\sigma^2,\alpha,\bar{\ell}) = \liminf_{N\to\infty} E\,H(X|Y)/K$. In both cases,
the limit is taken keeping the ratio $K/N = \alpha$ fixed.

If we let $\bar{\ell} = N$ and then $N \to \infty$, we would recover
the standard dense signature scheme (strictly speaking this
corresponds to $\Omega_\ell$ concentrated on $\ell = N$). Here we shall
invert the order of the two limits and let $N \to \infty$ first and
$\bar{\ell} \to \infty$ afterwards. Our first result shows that, if the limit
is taken in this way, Tanaka's formula is correct. For our proof
technique to work, $\alpha$ must be smaller than the 'spinodal value'
$\alpha_s$. This is the largest number such that, for any $\alpha < \alpha_s$, the
solution to Eq. (3) is unique for all $\sigma^2 \in [0, \infty)$, and is a
differentiable function of $\sigma^2$. By solving Eq. (3) numerically,
we get $\alpha_s \approx 1.49$.

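Theorem 1: If $\alpha < \alpha_s$, then the per-user conditional entropy
converges to Tanaka's formula in the dense signature
limit

$$\lim_{\bar{\ell}\to\infty} \overline{h}(\sigma^2,\alpha,\bar{\ell}) = \lim_{\bar{\ell}\to\infty} \underline{h}(\sigma^2,\alpha,\bar{\ell}) = h_{RS}(\sigma^2,\alpha). \tag{10}$$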



The hypothesis of Poisson signatures is presently used only
in the proof of Lemma 2. It should not, however, be difficult
to extend this result to more general sequences of degree
distributions $\Omega_\ell$.

A key step in the proof of the above result consists in analyzing
the BP-based detection algorithm defined by Eqs. (6),
(7). Our second result shows that, in the small $\alpha$ regime, this
algorithm is indeed optimal (the proof of this result is deferred
to a longer paper).

Theorem 2: Let $P_b(\bar{\ell}, N)$ be the expected bit error rate under
symbol MAP detection, and $P_b^{BP}(\bar{\ell}, N; t)$ the same quantity
for $t$ iterations of BP detection. Define the asymptotic BP error
overhead as

$$\Delta(\bar{\ell}; t) = \limsup_{N\to\infty}\big[P_b^{BP}(\bar{\ell}, N; t) - P_b(\bar{\ell}, N)\big]. \tag{11}$$

If $\alpha < \alpha_s$, then BP is optimal in the dense signature limit,
namely $\lim_{t\to\infty}\lim_{\bar{\ell}\to\infty} \Delta(\bar{\ell}; t) = 0$.

III. A sketch of the proof

A. A few simple remarks 

We start by collecting a few remarks whose proof is routine, 
and therefore omitted apart from a few hints. 

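All +1 input. For the sake of analysis (and for proving
Theorem 1) we can assume that the input signal is
$x = x_+ \equiv (+1, \dots, +1)^T$. In particular, if we write $E_+$ for
the joint expectation with respect to $y$ and $\mathbb{S}$, conditional
to $X = x_+$, then
$E_{\mathbb{S}} H(X|Y) = -E_{x,y,\mathbb{S}} \log P(X|Y,\mathbb{S}) = -E_+ \log P(X = x_+|Y,\mathbb{S})$.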

Density evolution (DE). Any finite neighborhood of a randomly
chosen node in the factor graph associated to the sparse
signature scheme converges in distribution to a tree with the
degree distribution mentioned above. As a consequence, the
distribution of the messages can be analyzed through a standard DE
approach.

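Define the sequence of random variables $\{u^t, v^t;\ t \ge 0\}$ as
follows: $v^0 = 0$, and

$$v^{t+1} \stackrel{d}{=} \sum_{b=1}^{\ell-1} u^t_b, \qquad u^{t} \stackrel{d}{=} f(v^t_1, s_1, \dots, v^t_k, s_k;\ s_0;\ y), \tag{12}$$

for $t \ge 0$. Here $\stackrel{d}{=}$ denotes identity in distribution; $u^t_1, u^t_2, \dots$
(respectively, $v^t_1, v^t_2, \dots$) are i.i.d. copies of $u^t$ (respectively,
of $v^t$); $\ell$ is an integer random variable with distribution $\omega_\ell$,
and $k$ is a Poisson random variable with mean $\bar{\ell}\alpha$; finally
$s_0, \dots, s_k$ are i.i.d. with $s_i \in \{+1,-1\}$ uniformly at random, and
$y = \frac{1}{\sqrt{\bar{\ell}}}\sum_{i=0}^{k} s_i + w$ with $w$ a normal random variable with
mean $0$ and variance $\sigma^2$.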

Let $(ia)$ be a uniformly random edge in the factor graph
and $v^t_{i\to a}$, $u^t_{a\to i}$ the corresponding BP messages, under the
assumption that $x_+$ has been transmitted. Then $v^t_{i\to a}$ (respectively
$u^t_{a\to i}$) converges in distribution to $v^t$ (respectively, to
$u^t$) as $N \to \infty$.

Symmetry condition. A random variable $X$ is 'symmetric' if
$E[f(-X)] = E[e^{-2X} f(X)]$ for any function $f$ such that both
expectations exist. It is easy to show that the random variables
$u^t$, $v^t$ defined above are symmetric (this is analogous to what
happens in LDPC codes).
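
A standard example of a symmetric random variable in this sense is a gaussian whose variance equals its mean: its density satisfies $p(-x) = e^{-2x}p(x)$, which gives the identity $E[f(-X)] = E[e^{-2X}f(X)]$. The Python sketch below checks this numerically by Gauss-Hermite quadrature; the function names, the test function `tanh` and the number of nodes are illustrative assumptions.

```python
import numpy as np

def gaussian_expectation(g, mean, var, n_nodes=100):
    """E[g(X)] for X ~ N(mean, var), by Gauss-Hermite quadrature."""
    z, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = w / w.sum()
    return float(np.sum(w * g(mean + np.sqrt(var) * z)))

def symmetry_gap(f, mean, var):
    """|E f(-X) - E[exp(-2X) f(X)]| for X ~ N(mean, var)."""
    lhs = gaussian_expectation(lambda x: f(-x), mean, var)
    rhs = gaussian_expectation(lambda x: np.exp(-2.0 * x) * f(x), mean, var)
    return abs(lhs - rhs)

# Variance equal to the mean: the gap vanishes up to quadrature error.
print(symmetry_gap(np.tanh, mean=1.0, var=1.0))
# Variance different from the mean: the identity fails.
print(symmetry_gap(np.tanh, mean=1.0, var=2.0))
```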

Area theorem. Following [7], the derivative, with respect to
the noise parameter, of the conditional entropy is proportional
to the expectation of the conditional variance

$$\frac{dH(X|Y)}{d\sigma^2} = \frac{1}{2\sigma^4}\, E_y\{\mathrm{Var}(\mathbb{S}X|Y)\}. \tag{13}$$



Let us take the expectation with respect to the signatures $\mathbb{S}$, and
normalize by the number of users. Using the all +1 assumption,
we get (derivative and expectation can be interchanged
because $H(X|Y)$ has positive bounded derivative, see below)

$$\frac{1}{K}\frac{dE\,H(X|Y)}{d\sigma^2} = \frac{1}{2\sigma^4}\,\frac{1}{N\bar{\ell}\alpha}\sum_{a=1}^{N} E_+\Big\{|\partial a| - \Big(\sum_{i\in\partial a} s_{ia}\hat{x}_i\Big)^2\Big\}, \tag{14}$$

where $\hat{x}_i = \hat{x}_i(Y,\mathbb{S}) = E[X_i|Y,\mathbb{S}]$. We shall sometimes refer
to the right hand side as the GEXIT function and denote
it by $g_N(\alpha,\sigma^2)$. From the above expressions it is easy to
realize that $0 \le g_N(\alpha,\sigma^2) \le 1/(2\sigma^4)$. The same inequalities
also hold at fixed $\mathbb{S}$, which justifies the exchange of derivative
and expectation above.

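As in Refs. [3], [4], [5], we furthermore introduce the BP
GEXIT function $g^t_{BP}(\alpha,\sigma^2)$, with $t$ a non-negative integer. This
is defined by replacing the expectation $\sum_{i\in\partial a} s_{ia}\hat{x}_i$ on the
right hand side of Eq. (14) by its estimate after $t$ iterations of
BP (in the $N \to \infty$ limit). In terms of the DE variables

$$g^t_{BP}(\alpha,\sigma^2) = \frac{1}{2\sigma^4\bar{\ell}\alpha}\, E\Big\{k - \Big(\sum_{i=1}^{k} s_i \langle x_i\rangle\Big)^2\Big\}, \tag{15}$$

where $\langle\,\cdot\,\rangle$ denotes an average over $x \in \{+1,-1\}^k$ with
distribution

$$\mu(x) \propto \exp\Big\{-\frac{1}{2\sigma^2}\Big(y - \frac{1}{\sqrt{\bar{\ell}}}\sum_{i=1}^{k} s_i x_i\Big)^2 + \sum_{i=1}^{k} v^t_i x_i\Big\}, \qquad y = \frac{1}{\sqrt{\bar{\ell}}}\sum_{i=1}^{k} s_i + w, \tag{16}$$

and the expectation $E$ is taken with respect to $\{v^t_i\}$ (i.i.d. and
distributed as $v^t$ from DE), $\{s_i\}$ (i.i.d. uniform in $\{+1,-1\}$),
$w$ (gaussian with mean zero and variance $\sigma^2$), and $k$ (Poisson
with mean $\bar{\ell}\alpha$).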

B. The proof 

The proof of Theorem 1 makes use of three lemmas, which
we state without proof for lack of space. As in
Section II-B, $\overline{h}(\alpha,\sigma^2,\bar{\ell})$ and $\underline{h}(\alpha,\sigma^2,\bar{\ell})$ denote, respectively,
the limsup and liminf of the expected conditional entropy
per bit, in the system with Poisson signatures.

The first lemma states that, in the low noise limit, the input 
can be reconstructed faithfully from the transmitted message 
and therefore the conditional entropy per bit vanishes (recall 
that we are dealing with discrete inputs). 

Lemma 1: For any $\alpha > 0$,
$\lim_{\sigma^2\to 0}\lim_{\bar{\ell}\to\infty}\overline{h}(\alpha,\sigma^2,\bar{\ell}) = 0$.

The proof is based on a union bound, and a combinatorial 
calculation. 

The second lemma provides upper and lower bounds on
the conditional entropy per user, in terms of BP GEXIT
functions. For the sake of definiteness, we state the lemma
for Poisson signatures (and denote the corresponding BP
GEXIT functions as $g^t_{BP}(\alpha,\sigma^2,\bar{\ell})$) although it obviously holds
in greater generality [5].

Lemma 2: For any $\bar{\ell} > 0$, $\sigma_0 > 0$, and non-negative integer
$t$,

$$1 - \int_{\sigma^2}^{\infty} g^t_{BP}(\alpha,\sigma'^2,\bar{\ell})\,d\sigma'^2 \le \underline{h}(\alpha,\sigma^2,\bar{\ell}) \le \overline{h}(\alpha,\sigma^2,\bar{\ell}) \le \overline{h}(\alpha,\sigma_0^2,\bar{\ell}) + \int_{\sigma_0^2}^{\sigma^2} g^t_{BP}(\alpha,\sigma'^2,\bar{\ell})\,d\sigma'^2. \tag{17}$$

This is in fact an easy consequence of the general result that
GEXIT functions preserve physical degradation [5].

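Finally, a lemma on the large $\bar{\ell}$ limit of DE.

Lemma 3: Define the sequence $\{\lambda_t;\ t \ge 0\}$ by setting $\lambda_0 = 0$
and

$$\lambda_{t+1} = \Big[\sigma^2 + \alpha\Big(1 - E_z \tanh^2\big(\lambda_t + \sqrt{\lambda_t}\, z\big)\Big)\Big]^{-1} \tag{18}$$

for any $t \ge 0$. Let $\{u^t, v^t;\ t \ge 0\}$ be the solution of DE for the
system with Poisson signatures (with mean $\bar{\ell}$) and the same
values of $\sigma^2$ and $\alpha$. Then, for any $t \ge 0$, $v^t$ converges in
distribution to a gaussian random variable with mean $\lambda_t$ and
variance $\lambda_t$ as $\bar{\ell} \to \infty$.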




Fig. 2. The bit error rate as a function of the noise parameter $\sigma$ at $\alpha = 1.3$.
The bold continuous line is Tanaka's result for dense signatures under symbol
MAP detection. MF refers to the same signature scheme under matched filter
detection. The other (dashed) lines correspond to sparse signatures and BP
detection.



The proof is based on a repeated application of the central limit 
theorem (the argument can be written as an induction over t). 
The reader is invited to try, for instance, with t = 1, 2, . . . . 
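
The recursion of Lemma 3 is easy to iterate numerically. The Python sketch below assumes the form $\lambda_{t+1} = [\sigma^2 + \alpha(1 - E_z\tanh^2(\lambda_t + \sqrt{\lambda_t}z))]^{-1}$ with $\lambda_0 = 0$ and evaluates the expectation by Gauss-Hermite quadrature. It also reports $Q(\sqrt{\lambda_t})$, the probability that a gaussian variable with mean and variance $\lambda_t$ is negative; reading this as a dense-limit bit error rate after $t$ iterations is an interpretation added here for illustration, not a statement taken from the text. All names are hypothetical.

```python
import math
import numpy as np

_NODES, _WEIGHTS = np.polynomial.hermite_e.hermegauss(80)
_WEIGHTS = _WEIGHTS / _WEIGHTS.sum()

def expect_tanh2(lam):
    """E_z tanh^2(lam + sqrt(lam) z) for a standard normal z."""
    return float(np.sum(_WEIGHTS * np.tanh(lam + np.sqrt(lam) * _NODES) ** 2))

def lambda_sequence(sigma2, alpha, n_iter):
    """Return [lambda_0, ..., lambda_{n_iter}] with lambda_0 = 0."""
    lam = 0.0
    seq = [lam]
    for _ in range(n_iter):
        q = expect_tanh2(lam)
        lam = 1.0 / (sigma2 + alpha * (1.0 - q))
        seq.append(lam)
    return seq

def gaussian_ber(lam):
    """P(N(lam, lam) < 0) = Q(sqrt(lam))."""
    return 0.5 * math.erfc(math.sqrt(lam / 2.0))

for t, lam in enumerate(lambda_sequence(sigma2=0.5, alpha=1.0, n_iter=10)):
    print(t, lam, gaussian_ber(lam))
```

The first step always gives $\lambda_1 = 1/(\sigma^2 + \alpha)$, which corresponds to treating all the interference from the other users as additional gaussian noise.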

Let us now turn to the proof of Theorem 1. We start by
using Lemma 3 to compute the large $\bar{\ell}$ limit of the BP GEXIT
functions. After a simple application of the central limit theorem,
we get

$$\lim_{\bar{\ell}\to\infty} g^t_{BP}(\alpha,\sigma^2,\bar{\ell}) = \frac{1}{2\sigma^2}\,\frac{1-q_t}{\sigma^2 + \alpha(1-q_t)}, \tag{19}$$

where $q_t \equiv E_z \tanh^2(\lambda_t + \sqrt{\lambda_t}\, z)$. We shall denote the
expression on the right hand side of Eq. (19) as $g^t_{BP}(\alpha,\sigma^2)$.

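Next, we use Lemma 2. Noticing that $0 \le g^t_{BP}(\alpha,\sigma^2,\bar{\ell}) \le 1/(2\sigma^4)$,
we can apply the dominated convergence theorem to
take the $\bar{\ell} \to \infty$ limit in Eq. (17). If we take $\sigma_0 \to 0$ afterwards
and apply Lemma 1, we get

$$1 - \int_{\sigma^2}^{\infty} g^t_{BP}(\alpha,\sigma'^2)\,d\sigma'^2 \le \underline{h}(\alpha,\sigma^2,\infty) \le \overline{h}(\alpha,\sigma^2,\infty) \le \int_{0}^{\sigma^2} g^t_{BP}(\alpha,\sigma'^2)\,d\sigma'^2, \tag{20}$$

where $\underline{h}(\alpha,\sigma^2,\infty) \equiv \liminf_{\bar{\ell}\to\infty}\underline{h}(\alpha,\sigma^2,\bar{\ell})$, and
$\overline{h}(\alpha,\sigma^2,\infty) \equiv \limsup_{\bar{\ell}\to\infty}\overline{h}(\alpha,\sigma^2,\bar{\ell})$.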

Simple calculus shows that $\lambda_t$ is strictly positive and increasing
in $t$ for $t \ge 1$, and $\lambda_t \simeq \sigma^{-2}$ as $\sigma \to 0$. Furthermore
$\lim_{t\to\infty}\lambda_t \equiv \lambda_{BP}$ is the smallest positive fixed point of the
recursion (18), i.e. the smallest positive solution of Tanaka's
stationarity equation (3).

Fig. 3. Same as in Fig. 2 but for $\alpha = 1.9$. The S-shaped dashed curve is
the analytical continuation of the bit error rate for dense signatures.

From these remarks, it follows that $g^t_{BP}(\alpha,\sigma^2)$ is integrable
over $\sigma^2 \in [0, \infty)$ and strictly decreasing in $t \ge 1$. We can
therefore take the $t \to \infty$ limit of Eq. (20) to get

$$1 - \int_{\sigma^2}^{\infty} g_{BP}(\alpha,\sigma'^2)\,d\sigma'^2 \le \underline{h}(\alpha,\sigma^2,\infty) \le \overline{h}(\alpha,\sigma^2,\infty) \le \int_{0}^{\sigma^2} g_{BP}(\alpha,\sigma'^2)\,d\sigma'^2, \tag{21}$$

where we defined

$$g_{BP}(\alpha,\sigma^2) \equiv \lim_{t\to\infty} g^t_{BP}(\alpha,\sigma^2) = \frac{1}{2\sigma^2}\,\frac{1-q_{BP}}{\sigma^2+\alpha(1-q_{BP})},$$

and $q_{BP} \equiv E_z\tanh^2(\lambda_{BP} + \sqrt{\lambda_{BP}}\, z)$.

We are left with the task of showing that the first and
the last expressions in Eq. (21) do indeed coincide and are
both equal to Tanaka's formula $h_{RS}(\alpha,\sigma^2)$. Recall that, for
$\alpha < \alpha_s$, the stationarity equation (3) admits a unique solution
depending smoothly on $\sigma^2$. Furthermore, we saw above that
this coincides with the BP fixed point. Using these remarks,
we can differentiate Eq. (2) with respect to $\sigma^2$, to get

$$\frac{d h_{RS}}{d\sigma^2}(\alpha,\sigma^2) = g_{BP}(\alpha,\sigma^2). \tag{22}$$

The proof is completed by applying the fundamental theorem
of calculus to Eq. (21) and noticing that $h_{RS}(\alpha,0) = 0$ and
$h_{RS}(\alpha,\infty) = 1$. □
IV. Numerical simulations

One may wonder how quickly the $\bar{\ell} \to \infty$ limit in
Theorems 1 and 2 is attained. In Fig. 2 we show the results
of numerical simulations using DE, and regular signatures ($\Omega_\ell$
concentrated on a single value), for $\alpha = 1.3 < \alpha_s$. Already at
$\bar{\ell} = 4$ the bit error rate is extremely close to the dense limit!

Even more surprising is the behavior for $\alpha > \alpha_s$. In Fig. 3
we show the data for $\alpha = 1.9$. The BP error rate at $\bar{\ell} = 4$
is close to the MAP one with dense signatures. However, it
worsens as $\bar{\ell}$ grows (and seems to approach the natural guess
for BP behavior with dense signatures). Sparse signatures are
the crucial ingredient allowing for low complexity detection
and close-to-optimal performances.


